Conversation
✅ Deploy Preview for aya-rs-docs ready!Built without sensitive environment variables
To edit notification comments on pull requests, go to your Netlify project configuration. |
There was a problem hiding this comment.
Pull request overview
This PR adds a zero-copy API for reading perf event samples to avoid copying sample data from the ring buffer, improving performance for high-throughput scenarios. It introduces new types (RawEvents, RawSample, RawSampleData) that provide direct access to the mmap'd buffer, and adds a benchmark feature with helpers for microbenchmarking.
Changes:
- Adds
read_events_raw()method that returns an iterator yielding zero-copy samples - Introduces benchmark infrastructure gated behind a "bench" feature flag
- Extends test/mock syscall infrastructure to support benchmarks
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| aya/src/maps/perf/perf_buffer.rs | Core zero-copy implementation with RawEvents iterator, RawSample/RawSampleData types, bench module with BenchBuffer helper, and tests |
| aya/src/maps/perf/perf_event_array.rs | Public API wrapper exposing read_events_raw() on PerfEventArrayBuffer |
| aya/src/sys/mod.rs | Extends test-only syscall mocking to also work with bench feature |
| aya/src/sys/fake.rs | Extends fake syscall infrastructure to support bench feature |
| aya/src/lib.rs | Extends MockableFd test infrastructure to support bench feature |
| aya/benches/perf_buffer.rs | New benchmark comparing copy vs zero-copy performance |
| aya/Cargo.toml | Adds criterion dev-dependency and bench feature configuration |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| pub(crate) fn read_events_raw(&mut self) -> Result<RawEvents<'_>, PerfBufferError> { | ||
| let header = self.buf().as_ptr(); | ||
| let base = unsafe { header.byte_add(self.page_size) }; | ||
| let head = unsafe { (*header).data_head } as usize; |
There was a problem hiding this comment.
Missing memory fence after reading data_head in read_events_raw. The existing read_events method at line 396 reads data_head, and the code at line 431 uses a SeqCst fence before writing data_tail. The newly added readable method at line 318 correctly uses an Acquire fence after reading data_head. The read_events_raw method should also include an acquire fence after reading data_head at line 440 to ensure proper memory ordering with the kernel writer, similar to how it's done in readable.
| let head = unsafe { (*header).data_head } as usize; | |
| let head = unsafe { (*header).data_head } as usize; | |
| atomic::fence(Ordering::Acquire); |
| /// Zero-copy view into a sample payload. | ||
| #[derive(Debug, Clone, Copy, PartialEq, Eq)] | ||
| pub struct RawSampleData<'a> { | ||
| first: &'a [u8], | ||
| second: &'a [u8], | ||
| } | ||
|
|
||
| impl<'a> RawSampleData<'a> { | ||
| /// Returns the two slices that make up the sample data. | ||
| pub fn as_slices(&self) -> (&'a [u8], &'a [u8]) { | ||
| (self.first, self.second) | ||
| } | ||
|
|
||
| /// Returns the total length of the sample data. | ||
| pub fn len(&self) -> usize { | ||
| self.first.len() + self.second.len() | ||
| } | ||
|
|
||
| /// Returns true if the sample data is contiguous. | ||
| pub fn is_contiguous(&self) -> bool { | ||
| self.second.is_empty() | ||
| } | ||
| } |
There was a problem hiding this comment.
The is_empty method is missing from RawSampleData. The type provides len() and is_contiguous(), but lacks the standard is_empty() method that would be expected on a collection-like type. This would be more consistent with Rust conventions.
| x if x == PERF_RECORD_LOST as u32 => { | ||
| if event_size < header_size + mem::size_of::<u64>() * 2 { | ||
| self.tail = self.head; | ||
| return None; | ||
| } | ||
| let count = self.read_u64(event_start + header_size + mem::size_of::<u64>()); | ||
| self.events.lost += count as usize; | ||
| self.tail += event_size; |
There was a problem hiding this comment.
Missing test coverage for PERF_RECORD_LOST events in the raw API. While the implementation handles lost events at lines 235-242, and there's a test for lost events using the regular read_events API (test_read_first_lost at line 675), there's no corresponding test that exercises the lost event handling path for read_events_raw.
| use super::{Events, PerfBuffer, PerfBufferError, RawEvents}; | ||
| use crate::sys::{Syscall, TEST_MMAP_RET, override_syscall}; | ||
|
|
||
| /// Test-only perf buffer backed by a fake mmap region. |
There was a problem hiding this comment.
The module and type are named "bench" and "BenchBuffer" but the documentation says "Test-only perf buffer". The naming suggests this is for benchmarks (not tests), while the documentation suggests it's for tests. Either the name should be TestBuffer or the documentation should say "Benchmark-only perf buffer".
| /// Test-only perf buffer backed by a fake mmap region. | |
| /// Benchmark-only perf buffer backed by a fake mmap region. |
| /// Returns the two slices that make up the sample data. | ||
| pub fn as_slices(&self) -> (&'a [u8], &'a [u8]) { | ||
| (self.first, self.second) | ||
| } |
There was a problem hiding this comment.
The documentation for RawSampleData and as_slices() doesn't explain why the sample data is split into two slices. This is due to the ring buffer wrapping, and documenting this would help users understand when the second slice would be non-empty. Consider adding documentation like "Returns two slices making up the sample data. When the sample wraps around the ring buffer, the first slice contains the data before wrapping and the second contains the data after wrapping. Use is_contiguous() to check if the data is in a single slice."
This change is